For Valid Generalization the Size of the Weights is More Important than the Size of the Network

Author

  • Peter L. Bartlett
Abstract

This paper shows that if a large neural network is used for a pattern classification problem, and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. More specifically, consider an ℓ-layer feed-forward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A. The misclassification probability converges to an error estimate (that is closely related to squared error on the training set) at rate O((cA)^{ℓ(ℓ+1)/2} √((log n)/m)), ignoring log factors, where m is the number of training patterns, n is the input dimension, and c is a constant. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training.
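The weight-decay heuristic the abstract mentions can be illustrated with a minimal sketch: gradient descent on squared error for a single sigmoid unit, with an L2 penalty (coefficient `lam`) that shrinks the weights toward zero. The unit, toy data, and hyperparameters here are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_with_weight_decay(X, y, lam=0.01, lr=0.5, epochs=500):
    """Gradient descent on mean squared error plus an L2 penalty
    lam * ||w||^2, which keeps the weight magnitudes small."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        # gradient of mean squared error wrt w, plus the weight-decay term
        grad = X.T @ ((p - y) * p * (1 - p)) / len(y) + 2 * lam * w
        w -= lr * grad
    return w

# toy data: labels depend only on the first input feature
X = np.hstack([np.linspace(-1, 1, 20).reshape(-1, 1),
               np.ones((20, 1))])          # feature + bias column
y = (X[:, 0] > 0).astype(float)
w_decay = train_with_weight_decay(X, y, lam=0.1)
w_free = train_with_weight_decay(X, y, lam=0.0)
# the penalized solution ends up with a smaller total weight magnitude
print(np.abs(w_decay).sum(), np.abs(w_free).sum())
```

In the paper's terms, the penalty bounds the sum of weight magnitudes (the quantity A), which is what the generalization bound depends on.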


Similar resources

Adaptive Quaternion Attitude Control of Aerodynamic Flight Control Vehicles

Conventional quaternion-based methods have been extensively employed for spacecraft attitude control, where aerodynamic forces can be neglected. In the presence of aerodynamic forces, flight attitude control is more complicated due to aerodynamic moments and inertia uncertainties. In this paper, a robust neuro-adaptive quat...


Enhancing Efficiency of Neural Network Model in Prediction of Firms Financial Crisis Using Input Space Dimension Reduction Techniques

The main focus of this study is data pre-processing: reducing the number of inputs, i.e. the input space dimension, so that the data set generalizes in fewer dimensions without losing the most significant information. When the input space is large, the most important input variables can be identified and insignificant variables eliminated, or a variable ...
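A common input-space reduction of the kind this abstract describes is principal component analysis; the sketch below is a generic PCA projection, assumed here for illustration rather than taken from the study, which projects the inputs onto their top-k principal components.

```python
import numpy as np

def pca_reduce(X, k):
    """Project X onto its top-k principal components
    (an illustrative dimension-reduction step)."""
    Xc = X - X.mean(axis=0)
    # eigen-decomposition of the sample covariance matrix
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)
    # eigh returns eigenvalues in ascending order; take the largest k
    top = vecs[:, np.argsort(vals)[::-1][:k]]
    return Xc @ top

X = np.random.default_rng(1).normal(size=(100, 10))
Z = pca_reduce(X, 3)
print(Z.shape)  # (100, 3)
```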


Estimation of the Active Network Size of Kermanian Males

Background: Estimation of the size of hidden and hard-to-reach sub-populations, such as drug-abusers, is a very important but difficult task. Network scale up (NSU) is one of the indirect size estimation techniques, which relies on the frequency of people belonging to a sub-population of interest among the social network of a random sample of the general population. In this study, we estimated ...


Design of Riprap Stone Around Bridge Piers Using Empirical and Neural Network Method

An attempt was made to develop a method for sizing stable riprap around bridge piers based on a huge amount of experimental data, which is available in the literature. All available experimental data for circular as well as round-nose-and-tail rectangular piers were collected. The data for rectangular piers, with different aspect ratios, aligned with the flow or skewed at different angles to th...


A Fast Error-Minimization-Based Pre-Training Method for Learning Convergence of Deep-Structured Neural Networks

In this paper, we propose an efficient method for pre-training a deep bottleneck neural network (DBNN). Pre-training is used to set initial values for the network weights; convergence of a DBNN is difficult because of its many local minima, some of which can be avoided with good initial weight values. This method divides the DBNN into multiple single-hidden-layer networks and adjusts them, then we...
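The divide-into-single-hidden-layer idea described above resembles greedy layer-wise pre-training. The following sketch, a generic autoencoder-based scheme in NumPy assumed for illustration rather than the paper's exact algorithm, trains each layer in turn on the previous layer's codes and returns initial weights for the deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, hidden, lr=0.1, epochs=200):
    """Train one single-hidden-layer autoencoder on X and return its
    encoder weights (a stand-in for the per-layer adjustment step)."""
    n_in = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_in, hidden))
    W2 = rng.normal(scale=0.1, size=(hidden, n_in))
    for _ in range(epochs):
        H = sigmoid(X @ W1)         # encode
        Xhat = H @ W2               # reconstruct
        err = Xhat - X
        # backpropagate the reconstruction error through both layers
        gW2 = H.T @ err / len(X)
        gH = err @ W2.T * H * (1 - H)
        gW1 = X.T @ gH / len(X)
        W1 -= lr * gW1
        W2 -= lr * gW2
    return W1

def greedy_pretrain(X, layer_sizes):
    """Pre-train each layer in turn on the previous layer's codes."""
    weights, H = [], X
    for h in layer_sizes:
        W = train_autoencoder(H, h)
        weights.append(W)
        H = sigmoid(H @ W)
    return weights  # initial values for the deep network's weights

X = rng.normal(size=(50, 8))
init = greedy_pretrain(X, [6, 4, 2])
print([W.shape for W in init])  # [(8, 6), (6, 4), (4, 2)]
```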




Publication date: 1996